[Perf][Metrics] Use flurry's concurrent hashmap for 5x throughput #2305
Towards #1740
Changes
The current implementation uses a `RwLock<HashMap>` for aggregating measurements. `RwLock` brings in a significant amount of contention even for concurrent reads.
Use the `flurry` crate's concurrent HashMap instead of a `RwLock<HashMap>`. The performance gains are huge!
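To make the contention point concrete, here is a minimal sketch of the `RwLock<HashMap>` aggregation pattern using only the standard library. `CounterMap`, `add`, `get`, and `run_concurrent` are hypothetical names chosen for illustration; they are not the SDK's actual types, and the sketch does not use `flurry` itself. Note that even the fast path has to acquire the lock in read mode, so every recording thread touches the lock's shared state.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;
use std::thread;

/// Hypothetical stand-in for an aggregation map guarded by a single RwLock.
pub struct CounterMap {
    trackers: RwLock<HashMap<String, AtomicU64>>,
}

impl CounterMap {
    pub fn new() -> Self {
        Self { trackers: RwLock::new(HashMap::new()) }
    }

    /// Record a measurement for `key`.
    pub fn add(&self, key: &str, delta: u64) {
        {
            // Fast path: the key already exists, so a shared read lock is enough.
            // Acquiring it still updates the lock's internal state on every call.
            let map = self.trackers.read().unwrap();
            if let Some(tracker) = map.get(key) {
                tracker.fetch_add(delta, Ordering::Relaxed);
                return;
            }
        } // read guard dropped here, before the write lock is requested

        // Slow path: first measurement for this key, so take the exclusive lock.
        // `entry` re-checks the key in case another thread inserted it meanwhile.
        let mut map = self.trackers.write().unwrap();
        map.entry(key.to_owned())
            .or_insert_with(|| AtomicU64::new(0))
            .fetch_add(delta, Ordering::Relaxed);
    }

    /// Read the current sum for `key`, if any measurement was recorded.
    pub fn get(&self, key: &str) -> Option<u64> {
        let map = self.trackers.read().unwrap();
        map.get(key).map(|tracker| tracker.load(Ordering::Relaxed))
    }
}

/// Have `threads` threads each record `per_thread` measurements on one key.
pub fn run_concurrent(threads: usize, per_thread: u64) -> u64 {
    let map = CounterMap::new();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    map.add("http.requests", 1);
                }
            });
        }
    });
    map.get("http.requests").unwrap_or(0)
}
```

A concurrent map removes the single lock from the read path, which is the effect this PR is measuring.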
Stress test results:
Machine details:
OS: Ubuntu 22.04.4 LTS (5.15.153.1-microsoft-standard-WSL2)
Hardware: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz, 16 vCPUs
RAM: 64.0 GB
Benchmarks
Note for reviewers
This PR is not meant for merging as-is. It's meant to show that we can utilize a more efficient concurrent data structure to our advantage. If we indeed decide to use `flurry`'s HashMap, we have to address the following:
`Ord` implementations for `KeyValue`. An `Ord` implementation for the HashMap's key type is a requirement from `flurry`. For this PR, I have added a very basic implementation just to unblock myself from testing the HashMap.
Merge requirement checklist
CHANGELOG.md files updated for non-trivial, user-facing changes
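On the `Ord` requirement raised in the note for reviewers, the sketch below shows one way a total order could be defined for a key/value attribute whose value may be a float. `Attr` and `AttrValue` are hypothetical, simplified types and are not the SDK's `KeyValue`; this is also not the implementation added in this PR. It assumes floats are ordered with `f64::total_cmp`, and that `Eq` and `Hash` are kept consistent with that ordering.

```rust
use std::cmp::Ordering;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified attribute value (not the SDK's value type).
#[derive(Clone, Debug)]
pub enum AttrValue {
    Bool(bool),
    I64(i64),
    F64(f64),
    Str(String),
}

impl AttrValue {
    /// Arbitrary but fixed rank used to order values of different variants.
    fn rank(&self) -> u8 {
        match self {
            AttrValue::Bool(_) => 0,
            AttrValue::I64(_) => 1,
            AttrValue::F64(_) => 2,
            AttrValue::Str(_) => 3,
        }
    }
}

impl Ord for AttrValue {
    fn cmp(&self, other: &Self) -> Ordering {
        use AttrValue::*;
        match (self, other) {
            (Bool(a), Bool(b)) => a.cmp(b),
            (I64(a), I64(b)) => a.cmp(b),
            // f64 has no Ord; total_cmp gives a total order that includes NaN.
            (F64(a), F64(b)) => a.total_cmp(b),
            (Str(a), Str(b)) => a.cmp(b),
            _ => self.rank().cmp(&other.rank()),
        }
    }
}

impl PartialOrd for AttrValue {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Equality is defined through `cmp` so that Eq and Ord always agree.
impl PartialEq for AttrValue {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for AttrValue {}

impl Hash for AttrValue {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.rank().hash(state);
        match self {
            AttrValue::Bool(b) => b.hash(state),
            AttrValue::I64(i) => i.hash(state),
            // Hash the bit pattern: total_cmp is Equal exactly when bits match.
            AttrValue::F64(f) => f.to_bits().hash(state),
            AttrValue::Str(s) => s.hash(state),
        }
    }
}

/// Hypothetical attribute; the derived Ord compares `key` first, then `value`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Attr {
    pub key: String,
    pub value: AttrValue,
}
```

One consequence of this choice is that `-0.0` and `0.0` become distinct keys, while a NaN equals another NaN with the same bit pattern. Whether that is acceptable for attribute sets would need to be decided before merging.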